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Introduction 


A  large  percentage  of  the  practical  numerical  algorithms  in  use  today 
require  some  information  about  the  actual  floating-point  number  system  on 
which  they  are  implemented.  For  example,  a  zero  finder  must  use  some  sort 
of  "machine  epsilon"  to  determine  when  it  has  found  a  "zero".  An  iterative 
improvement  subroutine  in  a  linear  system  solver  must  stop  iterating  when 
the  corrections  no  longer  affect  the  answer.  Sane  other  algorithms  which 
require  this  type  of  information  are:  eigenvalue-eigenvector  routines, 
ordinary  differential  equation  solvers,  function  minimizers,  etc. 

Usually  this  information  is  supplied  to  the  algorithm  in  one  of  two 
ways:  It  is  either  passed  as  a  parameter  or  it  is  imbedded  in 
one  or  more  constants.  In  the  first  case,  each  user  is  faced  with  the 
problem  of  understanding  another  confusing  parameter  in  the  calling  sequence 
and  he  is  likely  to  not  know  what  to  use  for  a  "machine  epsilon".  In  the 
second  case,  the  "magic"  numbers  in  a  program  are  often  not  understood  by 
people  reading  or  translating  the  program.  When  the  subroutine  is  moved 
from  one  machine  to  another,  these  numbers  are  seldom  changed  to  reflect 
the  properties  of  the  new  machine  —  even  when  the  author  of  the  original 
program  provides  explicit  comments  in  the  program  telling  what  the  constants 
mean  and  how  to  change  them. 

Since  one  of  the  original  motivations  for  designing  and  implementing 
high-level  languages  was  to  allow  a  program  written  for  one  machine  to 
run  on  other  machines,  I  think  that  this  problem  reflects  a  serious 
shortcoming  of  languages  like  Fortran  and  Algol.  Such  languages  should 
provide  standard  functions  which  return  information  pertinent  to  the  machine. 


However,  given  these  shortcomings,  it  is  reasonable  to  ask:  How  can 
information  about  the  number  system  of  a  computer  be  determined  auto¬ 
matically?  That  is,  can  a  subroutine  written  in  Fortran  compute  this 
information? 

The  Fortran  subroutine  given  in  the  next  section  partially  solves 
the  problem  for  a  large  class  of  floating-point  systems ,  Another 
Fortran  subroutine,  presented  in  Section  solves  the  same  problem 
for  a  more  restricted  set  of  floating-point  systems. 

2.  The  Fortran  Subroutine  ENVRON 

For  the  remainder  of  this  paper,  a  floating-point  number  system  F 
will  be  characterized  as  follows:  Each  number  will  have  a  radix  (3 
and  a  t-digit  mantissa  where  t  >  1  .  Usually  0  is  2  ,  8  ,  10  or  16  ,  but  £ 
will  only  be  restricted  to  be  a  positive  integer  greater  than  1  .  The 
exponent  e  is  assumed  to  lie  in  the  range 
m  <  e  <  M  , 

where  m  <  0  and  M  >  t  .  Each  nonzero  xeF  has  the  representation 
x  =  +  .d1d2...dt-pe  , 

where  d^, ...,d^  are  integers  satisfying 

0  <  d^  <  p-1  ,  (i  =  1, . . ., t)  . 

The  number  0  belongs  to  F  .  No  assumption  is  made  about  the  representation 
of  0  ;  however  it  is  usually  represented  by 

0  =  +  .oo...o*pm 

If  x  f  0  and  d^  ^  0  ,  then  x  is  said  to  be  normalized.  All 
floating-point  operations  (e.g.,  addition  and  multiplication)  are 
assumed  to  result  in  either  0  or  a  normalized  floating-point  number. 
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The  machine  will  do  either  proper  rounding  or  chopping  (truncation) . 

The  machine  epsilon  mentioned  in  the  previous  section  is  the 
smallest  positive  floating-point  number  e  such  that  e  $1  >  1  ,  where 
©  denotes  floating-point  addition.  Thus,  one  could  compute  e  from 
0  and  t  . 

The  Fortran  subroutine  shown  in  Figure  1  can  be  called  with  the 
Fortran  statement 

CALL  ENVRON  (IB,  IT,  IB) 

If  the  Fortran  program  is  running  on  a  machine  with  a  floating-point 
number  system  of  the  type  just  described,  then  the  actual  parameters 


will  be  returned  with  the  values 


IB  =  0  , 
IT  =  t  , 


if  the  machine  does  chopping, 
if  .e  machine  does  proper  rounding. 


1 

2 

3 

H 

5 

6 

7 

8 

9 

10 

11 

12 

13 

lU 

15 

16 

17 

18 

19 

20 


SUBROUTINE  ENVRON ( BETA, T,RND) 
INTEGER  BETA,  T,  RND 
RND  =  1 
A  =  2. 

B  =  2. 

100  IF  ((A+l.)-A.NE.l.)  GO  TO  200 
A  =  2.*A 
GO  TO  100 

200  IF  (A+B.NE.A)  GO  TO  300 
B  =  2.*B 
GO  TO  200 

300  BETA  =  (A+B)  -  A 

IF  (A+(BETA-1) .EQ.A)  RND  =  0 
T  =  0 
A  =  1 

lOO  T  =  T+l 

A  =  A*BETA 

IF  (  (A+l)  -A.B3,  .1)  GO  TO  100 

RETURN 

END 

Figure  1 
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Suppose  the  machine  on  which  ENVRON  is  executing  has  the 


floating-point  system  F  .  Then  the  consecutive  integers 

0,1,2, .. 

can  be  represented  exactly  in  F  .  Integers  larger  than  0^  which 
can  be  represented  exactly  are 

ff+fi,  0t+20,  0*+30,  ...,  0t+1,  Pt+1+P2,  ... 


Thus,  the  difference  between  neighboring  floating-point  numbers  in  the 
interval  [0t,0t+1]  is  0  .  The  first  part  of  ENVRON  (lines  4  through  8) 
tests  successive  powers  of  2  until  a  floating-point  number  (A)  in 
this  interval  is  found.  Lines  9  through  12  add  successive  powers  of  2 
to  A  until  the  next  floating-point  number  (A--0)  is  found  and  then  0 
is  computed  by  subtracting  chese  two  numbers.  To  determine  whether  rounding 

or  chopping  is  being  done  (line  13),  0-1  is  added  to  A  .  Now,  since  A 

is  in  the  interval 


0  <  A  <  0 


t+1 


} 


the  number  t  can  be  computed  by 
t  =  l_logp  Aj  . 

However,  possible  inaccuracies  in  computing  the  logarithm  are  avoided  by 
determining  the  power  of  0  required  to  shift  the  least  significant  digit 
of  an  integer  out  of  the  mantissa.  The  smallest  such  exponent  is  equal 
to  t  . 

The  time  required  for  ENVRON  to  execute  is  roughly  proportional  to 
t 

log2  0  .  For  any  practical  application,  the  execution  time  is  negligible. 

It  is  important  to  note  that  the  algorithm  used  in  ENVRCN  does  not 
rely  upon  the  use  of  guard  digits  in  the  floating-point  additions. 
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The  author  believes  this  algorithm  to  be  a  very  efficient  way  of 
computing  0  }  t  and  whether  the  floating-point  system  rounds  or  chops 


3*  A  Special  Algorithm  for  the  Cases  0  =  2,  4,  8,  10  and  16 

After  using  the  technique  described  in  the  previous  section  to 
determine  the  number  Ae  [p^p^+^)  ,  the  following  trick  can  be  used 
to  determine  both  0  and  whether  rounding  or  chopping  is  done: 


1.  Set  B:=A+15  (H  is  another  floating-point  representation). 

2.  If  B=A  ,  then  0=l6  and  chopping  is  done. 

3.  If  B=A+8  ,  then  0=8  and  chopping  is  done. 

4.  If  B=A+10  ,  then  0=10  and  chopping  is  done. 

5.  If  B=A+12  ,  then  0=4  and  chopping  is  done. 

6.  If  B=A+l4  ,  then  0=2  and  chopping  is  done. 

7.  If  B=A+l6  ,  then  rounding  iB  done  and  0  is  either  2,  4,  8 

8.  If  B=A+20  ,  then  0=10  and  rounding  iB  done. 


The  case  B=A+l6  can  be  resolved  by 


1. 

Set 

r  =a+5. 

2. 

If 

B=A  ,  then  0=l6  , 

3. 

If 

B=A+4  ,  then  0=4  . 

4. 

If 

B=A+6  ,  then  0=2  . 

5. 

If 

B=A+8  ,  then  0=8  . 

or  16. 
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A  Fortran  subroutine  incorporating  this  idea  and  using  the  same 
name  and  calling  sequence  as  the  subroutine  given  in  Section  2  is  shown 
in  Figure  2.  Although  the  code  is  longer  than  the  version  in  Figure  1, 
the  execution  time  for  this  version  is  slightly  smaller. 


SUBROUTINE  ENVRON(BETA,T,RND) 

INTEGER  BETA,  T,  RND 

THIS  VERSION  WORKS  FOR  MACHINES  WITH  BASE  2,  4,  8,  10  OR  16 

RND  =  0 
A  =  2. 

10  IF  (  (A+l.)  -A.NF..1.)  GO  TO  20 
A  =  2.*A 
GO  TO  10 

20  1  =  IFIX((A+15.)-A)  +  1 

GO  TO  (50, 1,1, 1,1, 1,1, 1,40,1,  50, 1,60,1,70, 1,80,1, 1,1,  90),  I 
1  STOP 

30  BETA  =  16  - 

GO  TO  100 
40  BETA  =  8 
GO  TO  100 
50  BETA  =  10 
GO  TO  IDO 
60  BETA  =  4 
GO  TO  100 
70  BETA  =  2 
GO  TO  100 
80  RND  =  1 

I  =  ifix((a+5.)-a)  +  1 
GO  TO  (82, 1,1, 1,84, 1,86, 1,88), I 
82  BETA  =  16 
GO  TO  100 
84  BETA  =  4 
GO  TO  100 
86  BETA  =  2 
GO  TO  100 
88  BETA  =  8 
GO  TO  100 
90  RND  =  1 
BETA  =  10 
100  T  =  0 
A  =  1 

110  T  =  T+l 
A  =  A*BETA 

IF  ((A+l) -A. 0^.1)  GO  TO  no 

RETURN 

END 


Figure  2 
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H.  Conclusions 


The  algorithms  given  in  Figures  1  and  2  will  determine  certain 
characteristics  of  the  floating-point  number  system  of  any  machine 
currently  in  use  (at  least  those  floating-point  machines  of  which  the 
author  is  aware) .  Specifically,  the  number  base,  number  of  digits  and 
whether  rounding  or  chopping  is  done,  can  be  computed  automatically. 

Programs  and  subroutines  in  general  use,  such  as  library  routines, 
should  avoid  additional  parameters  in  the  calling  sequence  and  magic 
constants  in  the  code  by  using  one  of  these  subroutines  for  computing 
the  floating-point  environment  of  the  current  machine.  This  not  only 
makes  the  code  more  readable  but  the  portability  of  the  program  is 
greatly  increased.  The  additional  execution  time  required  to  cell  such 
a  routine  is  insignificant  compared  to  these  advantages. 

Unfortunately,  it  is  not  possible  to  write  a  general  subroutine  to 
compute  upper  and  lower  bounds  for  the  float ing -po int  exponent 
(m  and  M)  .  If  underflow  and  overflow  conditions  were  handled  in  some 
uniform  manner,  it  would  be  possible  to  do  so.  Sane  programs  make  use 
of  the  values  of  m  and  M  .  Thus  it  would  be  worthwhile  for  software 
manufacturers  to  consider  ways  of  providing  such  information  automatically. 
Automatic  determination  of  properties  of  the  integer  arithmetic  system 
would  also  be  useful.  A  good  universal  randan  number  generator  could 
be  written  if  it  were  possible  to  automatically  determine  the  magnitude 
of  the  largest  representablf  integer. 

Other  desirable  environmental  parameters  are  listed  in  a  paper 
by  Redish  and  Ward. 


7 


Acknowledgment 

The  author  would  like  to  thank  Professor  Cleve  Moler  for  arousing 
his  interest  in  this  problem.  Mr.  Richard  Sites  contributed  some  of 
the  ideas  which  led  to  an  earlier  version  of  ENVRON.  The  author 
would  like  to  thank  Professor  Robert  Floyd  for  questioning  the 
"optimality”  of  this  earlier  version  and  for  a  discussion  which  led 
to  improved  versions. 


6.  Bibliography 

Redish,  K.  A.  and  Ward,  W.,  "Environment  Enquiries  for  Numerical 
Analysis",  SIGNUM  Newsletter  6  (l),  January  1971,  10-15 . 


